Abstract
Background:
Prolonged exposure to thiopurine drugs is a crucial component of curative therapy for acute lymphoblastic leukemia (ALL). However, these medications are also known to cause significant hematological toxicity, namely myelosuppression, often leading to an increased risk of severe infection. Polymorphisms in genes involved in thiopurine metabolism have been linked to inter-patient variability in these drug toxicity phenotypes. For example, loss-of-function coding variants in thiopurine S-methyltransferase (TPMT) predispose patients to thiopurine-induced myelosuppression, making TPMT-guided drug dosing one of the first clinically implemented pharmacogenetic associations in oncology.
In addition to coding variants, a variable number tandem repeat (VNTR) in the promoter of TPMT has been suggested to influence transcription in cis and affect TPMT enzymatic activity. However, prior studies were limited by small sample sizes and minimal population diversity. Therefore, we performed a large, comprehensive analysis to define the role of the TPMT-VNTR and its potential pharmacogenetic importance.
Method:
We collected Illumina short-read whole genome sequencing data encompassing a diverse population of 3,197 individuals from the 1000 Genomes Project, of whom 956 also had Nanopore long-read sequencing. Using this dataset, we developed a short-read sequencing-based VNTR calling algorithm to extract TPMT-VNTR genotypes. TPMT gene expression in EBV-transformed lymphoblastoid cell lines, measured by RNA-seq, was available for 449 subjects of the 1000 Genomes cohort. We then used our VNTR calling algorithm to analyze short-read sequencing results for 457 children with newly diagnosed ALL enrolled on either the St. Jude Total Therapy XVI or XVII Study. TPMT enzymatic activity was measured in 454 of these patients. Of these, 369 also had 6-MP metabolites, including thioguanine nucleotides (TGNs) and methylmercaptopurine nucleotides (MMPNs), quantified using liquid chromatography-mass spectrometry. A linear mixed effects model was used to assess the association between VNTR and TPMT activity or TGN/MMPN ratio.
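As a rough illustration of the kind of association test described above, the sketch below fits a slope of a phenotype on total repeat count. It uses ordinary least squares, not the linear mixed effects model used in the study (random effects and covariates are omitted), and the function name `ols_fit` and all data values are hypothetical.

```python
def ols_fit(x, y):
    """Ordinary least squares fit of y on x; returns (slope, intercept).

    Simplified stand-in for the study's linear mixed effects model:
    no random effects and no covariates are included here.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has zero variance")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up values for illustration only (not study data):
# total repeat count per individual and a TPMT activity measurement.
repeats = [8, 9, 10, 11, 12]
activity = [20.0, 19.0, 18.5, 17.0, 16.5]

slope, intercept = ols_fit(repeats, activity)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```

A negative slope in such a fit corresponds to lower activity with increasing repeat count. A mixed effects model would additionally account for grouping structure in the data.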
Results:
To evaluate the accuracy of our short-read VNTR calling algorithm, we compared its genotype calls to long-read-derived reference genotypes in the 1000 Genomes cohort and observed 98.8% concordance.
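A concordance figure like the one above could be computed as sketched below. This assumes concordance means an exact match of the unordered allele pair per sample, which the abstract does not specify. The function name `genotype_concordance`, the sample IDs, and the genotypes are hypothetical.

```python
def genotype_concordance(short_read_calls, long_read_calls):
    """Fraction of shared samples whose unordered allele pairs match exactly.

    Assumption: a genotype is a pair of allele strings, and allele order
    within the pair carries no meaning.
    """
    shared = sorted(set(short_read_calls) & set(long_read_calls))
    if not shared:
        raise ValueError("no samples in common")
    matches = sum(
        sorted(short_read_calls[s]) == sorted(long_read_calls[s]) for s in shared
    )
    return matches / len(shared)


# Made-up genotypes for illustration only (not study data).
short_read = {
    "S1": ("A2BC", "A2B2C"),
    "S2": ("A2BC", "A2BC"),
    "S3": ("AB2C", "A2BC"),
    "S4": ("A2B2C", "A2B2C"),
}
long_read = {
    "S1": ("A2B2C", "A2BC"),   # same pair as the short-read call, order swapped
    "S2": ("A2BC", "A2BC"),
    "S3": ("AB3C", "A2BC"),    # one allele differs
    "S4": ("A2B2C", "A2B2C"),
}

print(genotype_concordance(short_read, long_read))
```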
In the 1000 Genomes cohort, we observed that VNTR alleles ranged in size from three to ten repeats and contained several combinations of three motifs: A, B, and C. A2BC and A2B2C were the most common alleles, with population frequencies of 44.6% and 29.8%, respectively. VNTR alleles were most diverse in populations of African ancestry, which had a greater proportion of subjects with high repeat counts (13.7%, compared to 4.69% in all other populations combined). Additionally, across different ancestry groups, we observed strong linkage between known TPMT coding variants and VNTR alleles. More specifically, VNTR alleles with one or, rarely, two A repeats were in linkage with TPMT *3C and *3A, but only when the same allele carried two or more B repeats.
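To make the allele notation concrete, the sketch below parses strings such as A2BC into per-motif counts. It assumes a digit following a motif letter gives the number of consecutive copies of that motif and that a letter without a digit occurs once. The function names are hypothetical, and summing both alleles to get a per-individual total is also an assumption.

```python
import re
from collections import Counter


def parse_vntr_allele(allele):
    """Return per-motif repeat counts for an allele string such as 'A2B2C'.

    Assumption: each motif letter (A, B, or C) may be followed by a number
    giving its copy count; a missing number means one copy.
    """
    if not re.fullmatch(r"(?:[ABC]\d*)+", allele):
        raise ValueError(f"unrecognized allele string: {allele!r}")
    counts = Counter()
    for motif, n in re.findall(r"([ABC])(\d*)", allele):
        counts[motif] += int(n) if n else 1
    return counts


def total_repeats(allele):
    """Total number of repeat units in one allele."""
    return sum(parse_vntr_allele(allele).values())


def individual_total(genotype):
    """Sum of repeat units over both alleles (assumed per-individual total)."""
    return sum(total_repeats(a) for a in genotype)


print(parse_vntr_allele("A2BC"))
print(total_repeats("A2B2C"))
print(individual_total(("A2BC", "A2B2C")))
```

Under these assumptions, A2BC has four repeat units (two A, one B, one C) and A2B2C has five.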
To avoid confounding effects, we excluded individuals carrying any of the known TPMT coding variants from subsequent analyses. The total number of VNTR repeats per individual was inversely correlated with TPMT gene expression in the 1000 Genomes cohort (n = 449, coef = -0.529, p = 0.033). Similarly, the total number of repeats was negatively associated with TPMT activity in St. Jude ALL patients (n = 454, coef = -0.353, p = 0.0014). Additionally, a higher total number of repeats was significantly associated with a higher TGN/MMPN ratio in ALL patients (n = 369, coef = 0.115, p = 0.011), consistent with lower TPMT activity. Across these three phenotypes, the associations with total repeat count are likely driven by the number of A repeats (expression: coef = -0.725, p = 0.017; activity: coef = -0.364, p = 0.017; TGN/MMPN ratio: coef = 0.118, p = 0.052).
Analyses associating VNTR with thiopurine toxicities (myelosuppression and liver function) using the St. Jude ALL cohort are ongoing.
Conclusion:
Using whole genome sequencing data from large reference populations and ALL patients, we have demonstrated that the TPMT-VNTR, in addition to coding variants in this gene, affects TPMT transcription and enzymatic function. This work highlights the importance of noncoding variants in modulating drug efficacy and toxicity in hematological malignancies.